Implementation Experiences in Transparently Harnessing Cluster-Wide Memory

Authors

Michael R. Hines

Mark Lewandowski

Jian Wang

Kartik Gopalan

Abstract

There is a constant battle to break even between continuing improvements in DRAM capacities and the growing memory demands of large-memory high-performance applications. Performance of such applications degrades quickly once the system hits the physical memory limit and starts swapping to the local disk. In this paper, we investigate the benefits and tradeoffs of pooling together the collective memory resources of nodes across a high-speed LAN-based cluster. We present the design, implementation and evaluation of Anemone (an Adaptive Network Memory Engine), which virtualizes the collective unused memory of multiple machines across a gigabit Ethernet LAN without requiring any modifications to the large-memory applications. We have implemented a working prototype of Anemone and evaluated it using real-world unmodified applications such as ray-tracing and large in-memory sorting. Our results with the Anemone prototype show that unmodified single-process applications execute 2 to 3 times faster and multiple concurrent processes execute 6 to 7.7 times faster when compared to disk-based paging. The Anemone prototype reduces page-fault latencies by a factor of 19.6, from an average of 9.8 ms with disk-based paging to 500 μs with Anemone. Most importantly, Anemone provides virtualized low-latency access to potentially “unlimited” memory resources across the network.

Similar resources

Anemone: Transparently Harnessing Cluster-Wide Memory

Fast Transparent Cluster-Wide Paging

In a cluster with a very low-latency interconnect, the remote memory of nodes can serve as a storage that is faster than local disk but slower than local memory. In this paper, we address the problem of transparently utilizing this cluster-wide pool of unused memory as a low-latency paging device. Such a transparent remote memory paging system can enable large-memory applications to benefit fro...

The implementation of the Cm* multi-microprocessor by RICHARD

The implementation of a hierarchical, packet switched multiprocessor is presented. The lowest level of the structure, a Computer Module, is a processor-memory pair. Computer Modules are grouped to form a cluster; communication within the cluster is via a parallel bus controlled by a centralized address mapping processor. Clusters communicate via intercluster busses. A memory reference by a prog...

KIMP: Multicheckpointing Multiprocessors

Multiprocessors are coming into wide-spread use in many application areas, yet there are a number of challenges to achieving a good tradeoff between complexity and performance. For example, while implementing memory coherence and consistency is essential for correctness, efficient implementation of critical sections and synchronization points is desirable for performance. The multi-checkpointin...

OS Experimentation and a User Community Coexist Under the DUnX Kernel

The class of NUMA (nonuniform memory access time) shared memory architectures is becoming increasingly important with the desire for larger scale multiprocessors. In such machines, the placement and movement of code and data are crucial to performance. The operating system can play a role in managing placement through the policies and mechanisms of the virtual memory subsystem. An implementatio...

Publication date: 2006
